A Generalised Theory of the Combination of Observations
so as to Obtain the Best Result. 

By Simon Newcomb. 



§ 1. Introductory Remarks.

The accepted practice of combining observations rests upon the hypothesis 
that the frequency of errors follows a certain well-known law which may be 
expressed as follows: Let Δ be the amount by which the result of an
observation may differ from the value obtained by taking the mean result of an
infinity of similar observations. Δ will then be the error of the observation.
The infinitesimal probability that an error will be contained between the limits Δ
and Δ + dΔ is supposed to be given by the equation

$$dp = \frac{h}{\sqrt{\pi}}\,e^{-h^{2}\Delta^{2}}\,d\Delta,$$

in which h is the "modulus of precision" depending upon the accuracy of the
observations, and e is the Naperian base.

The modulus h is commonly replaced by a probable error r, which term
signifies such a magnitude that the number of errors less than r in absolute
value is equal to the number which exceed r. The value of r is given in terms
of the modulus h by the equation

$$r = \frac{0.4769}{h}.$$

When the errors really follow the law in question, they diminish with
extreme rapidity as Δ increases. For example, only one per cent of the errors
should fall without the limits ± 4r.

As a matter of fact, however, the cases are quite exceptional in which the 
errors are found to really follow the law. The general rule is that much more 
than one per cent of the errors exceed four times the probable error. In other
words, it is nearly always found that some of the outstanding errors seem
abnormally large. The method of dealing with these abnormal errors has always 
been one of the most difficult questions in the treatment of observations. The 
common practice has been to consider the observations affected by them as 
abnormal, and to reject them in obtaining the final result. But we here meet 
with the difficulty that no positive criterion for determining whether an observa- 
tion should or should not be treated as abnormal is possible. Several attempts 
have indeed been made to formulate such a criterion, the best known of which is 
that of Peirce.* 

Peirce's criterion has always seemed to me subject to two serious objec- 
tions. One is that it takes no account of any probable error of the observations 
under consideration which may be known beforehand, but proceeds as if the 
value of the probable error could be deduced from the comparison of the 
observations inter se. An immediate general consequence of this is that if all 
the errors of a system are multiplied by the same factor, the same observations 
are rejected as before, how small or great soever the factor may be.†

The second objection is that it takes no account of the fact that the a priori 
probability that an observer should make an abnormal observation varies with 
the observer, and places all observers on a level by regarding that probability as 
determined by a general mathematical principle applicable to all cases.

It is, however, well known that some observers make very few abnormal 
observations, while others are extremely liable to them. It is evident that if we 
are dealing with an observation whose error is so large that we doubt whether 
it should or should not be considered abnormal, our judgment must depend very 
largely upon any knowledge we may have of the carefulness of the observer. 

* Gould's Astronomical Journal, Vol. II, p. 161.

† Certain results of Peirce's criterion in special cases, when applied to sets of three or four
observations, do not seem to have been hitherto noticed. The following are cases in point:

Of a set of three observations none are ever rejected by it, no matter how much one may deviate
from the mean of the other two.

In a set of four observations, if three agree exactly, the fourth will always be rejected if it differs
ever so little from the others. More generally, if no one of the three results which agree best among
themselves differs from the mean of the three by more than ε, then a fourth, which differs from that
mean by more than 4ε, will be rejected. For example, if the results of four observations with a meridian
circle were 0".3; 0".4; 0".5; 0".8, the last would be rejected.

The fact is, however, that any system of rejecting supposed abnormal
observations is subject to the objection of leading to a result which is a
discontinuous function of the separate errors of observation, and hence to results
which are sometimes indeterminate. Suppose, for example, that we are dealing
with the mean of three observations, two of which are closely accordant, while 
the third differs from the mean of the other two by the quantity x. Let us 
represent the mean of the two accordant ones by the symbol m′; then, if we
include the discordant observation, the general expression for the mean result
in terms of x will be

$$m = m' + \tfrac{1}{3}x.$$

In ordinary astronomical practice we retain this value of m so long as x
does not exceed the limit which we consider that of a normal error. But, as soon
as this limit is reached, we drop x entirely and take m′ for the value of m. In other
words, if we consider x to increase from zero, the adopted value of m will increase
one-third as fast until the assigned limit is reached, and will then suddenly spring
back from m′ + ⅓x to m′. If the critical point at which x is to be rejected
could be satisfactorily defined, this course would be less objectionable. But, as
a matter of fact, it is to be determined by the judgment of the investigator, with
the result that between certain wide limits the investigator must himself be
doubtful whether he should take m′ or m′ + ⅓x as his result. Of course different
investigators would reach different conclusions in special cases, and thus the most 
probable result is frequently indeterminate. 
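
The discontinuity just described can be made concrete with a short sketch. The function name and the numerical rejection limit below are illustrative assumptions, not part of the paper; the sketch only restates the rule that the third observation is kept while x is below the limit and dropped once the limit is reached.

```python
def adopted_mean(m_prime: float, x: float, limit: float) -> float:
    """Mean of three observations: two accordant ones with mean m_prime and a
    third at m_prime + x, the third being rejected once |x| reaches `limit`."""
    if abs(x) < limit:
        return m_prime + x / 3.0   # all three retained: m = m' + x/3
    return m_prime                 # discordant observation rejected


if __name__ == "__main__":
    limit = 3.0  # hypothetical rejection limit, chosen only for illustration
    for x in (0.0, 1.5, 2.9, 3.0, 4.5):
        print(f"x = {x:4.1f}  adopted m = {adopted_mean(0.0, x, limit):.4f}")
```

The adopted value rises with x at one-third the rate and then falls back to m′ at the limit, so two investigators who place the limit differently adopt different results from the same observations.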

There are classes of important observations in which the proportion of 
large errors is so great that no separation into normal and abnormal observa- 
tions is possible. This is the case in observations of transits of Venus and 
Mercury over the sun. A noteworthy instance has been given by the writer in his 
discussion of transits of Mercury.* By a comparison of 684 observations it was 
found that the errors of one-half of them were contained between the limits ± 6".8.
If the errors followed the commonly assumed law, then only 5 of them should
have exceeded ± 27 seconds. As a matter of fact, however, it was found that 49
exceeded these limits. Yet these 49 observations cannot be considered as wholly
worthless, because their results are not scattered entirely at random, and are mostly
included between comparatively narrow limits. They differ from the other
observations only in having a larger probable error.
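
The expected count of five can be checked from the commonly assumed law. The following is a minimal sketch, assuming a probable error of 6.8 and the relation r = 0.4769/h quoted in § 1; the function name is illustrative.

```python
import math

RHO = 0.4769  # r = RHO / h, the relation quoted in § 1


def fraction_beyond(limit: float, probable_error: float) -> float:
    """Fraction of errors exceeding `limit` in absolute value under the usual law."""
    h = RHO / probable_error
    return math.erfc(h * limit)   # equals 1 - erf(h * limit)


expected = 684 * fraction_beyond(27.0, 6.8)
print(f"expected beyond +/-27: {expected:.1f} of 684 (observed: 49)")
```

The same function gives the fraction of errors beyond four probable errors as `fraction_beyond(4.0, 1.0)`, which falls below one per cent.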

* Astronomical Papers of the American Ephemeris, Vol. I, pp. 379-383.

The case may be made clearer by reflecting that the law in question
presupposes that the observations under consideration are all of the same general
quality as regards liability to error; in other words, that they are all liable to
the same errors, and differ only in the accidental circumstances which give rise
to the errors. If, however, this is not true ; if, for example, we are furnished 
with a system of observations of which one portion have a small probable error, 
another a larger probable error, a third a yet larger one, and so on, then the 
errors of the whole system will not follow the law in question, but we shall find 
that large errors are disproportionately frequent. Now, this must be the case in
nearly all astronomical and physical work. 
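
The excess of large errors in a mixed system can be illustrated numerically. The sketch below is only an illustration: the three classes and their proportions are hypothetical, and the probable error of the mixture is found by bisection as the limit containing one-half of the errors.

```python
import math


def frac_within(limit, classes):
    """Fraction of errors below `limit` in absolute value for a mixture.
    `classes` is a list of (probability, h) pairs."""
    return sum(p * math.erf(h * limit) for p, h in classes)


def probable_error(classes):
    """Bisect for the limit that contains exactly one-half of the errors."""
    lo, hi = 0.0, 10.0 / min(h for _, h in classes)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if frac_within(mid, classes) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


single = [(1.0, 1.0)]                          # one measure of precision
mixed = [(0.6, 2.0), (0.3, 1.0), (0.1, 0.4)]   # hypothetical mixture

for name, classes in (("single", single), ("mixed", mixed)):
    r = probable_error(classes)
    beyond = 1.0 - frac_within(4.0 * r, classes)
    print(f"{name}: r = {r:.4f}, fraction beyond 4r = {beyond:.4f}")
```

For the single class the fraction beyond four probable errors stays under one per cent; for the assumed mixture it is several times larger, which is the behaviour described above.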

From this another conclusion follows. In such a mixed system of observa- 
tions the most probable result will be, not the arithmetical mean, but a mean 
obtained by giving less weight to the more discordant observations. This will 
be evident on reflecting that in such a case the more discordant results will 
probably belong to the observations having a larger probable error and therefore 
the less weight. 

§ 2. Modified Curves of Probability.

The preceding considerations lead us to the further conclusion that the 
commonly received theory which presupposes that there must always be some 
one "most probable value" of a quantity determined by observations, lacks
generality. The fact is that, in special cases, owing to a possibility of abnormal 
observations, the curve of probability may have a great variety of forms. As 
one example, let it be supposed that two mean declinations of a star, determined 
with a good meridian circle, the micrometer-head of which is numbered at
intervals of 5", differ from each other by a quantity approximating to 5". We 
then may make three hypotheses : that the observations are both normal, or that 
one or the other of them is in error by 5" through a mistake in recording. 

According to the probability of the first hypothesis, and of either of the 
other two, we may have different curves. Assuming the instrument and 
observer to be so accurate that a difference of 5" between two normal obser- 
vations is nearly out of the question, we shall have a curve of the form A. As 
the probability of the first hypothesis increases, the curve may assume the
form B. If the observer is one never known to make mistakes in reading, the 
curve will approximate to its usual form. 




[Fig. A. Fig. B.]

Now, it is evident that in such a case as that indicated by curve A, we can
assign no one most probable value of the observed quantity. The only complete 
statement we could make would be embodied in a table showing for each 
separate possible value of the required quantity the probability that the quantity 
had a value not differing from it by more than a small arbitrary amount. 
Assuming the intervals of the table to be taken to equal double this amount, the 
sum of all these probabilities would be unity. 

Looking further into the matter, we see that this is a general method of 
expressing the conclusion to be derived in all cases from an observation or a series 
of observations. No matter how definite the primary value given by an observa- 
tion may be, the actual conclusion to be drawn is always a series of separate 
probabilities that the quantity observed has some one of an infinite series of 
values. If the law of probability is that commonly assumed, then the probability 
of each assignable value is completely determined when the most probable value 
and the probable error are given. But such is not the case when the law of 
probability deviates from this form. 

§ 3. Evil and Worth of Erroneous Results. 

The question now arises whether, when we consider the most general case, 
in which there may be several maxima of probability, and when, therefore, no 
one most probable value can be assigned, it is possible to formulate any general 
principle by which a single value shall be preferred for acceptance above all others. 
Taking as an example such a case as A just given, it is clear that no such 
principle is possible without some antecedent hypothesis determining a law of 
choice between errors of different magnitudes. If, to fix the ideas, we suppose
that in case A the results of the two separate observations were 0".0 and 5".0, then
the three hypotheses will give us 0".0; 2".5; 5".0, as three values between which
we are to choose. If we choose either the first or third, we shall have a
probability of slightly less than one-half of being very near the truth, and an equal
probability of being 5" in error, together with a very small probability of being
about 2".5 in error. If, on the other hand, we take 2".5 as our result, we shall
be almost sure of being between 2" and 3" in error, and no more. Our choice,
then, must depend on whether a certainty of being 2".5 in error, or an even
chance of each of the errors 0" and 5", is preferable. This again turns upon the
question whether an error of 5" involves more or less than twice the evil of an
error of 2".5.

The ordinary requirements of practical life are in favor of the view that
the evil increases in a higher ratio than the simple magnitude of the error. As 
examples, if it is a case of an error in the position of a ship arising from an 
error of the Nautical Almanac, we readily see that the probability of the error 
leading to a shipwreck increases in a higher ratio than that of the simple 
error itself. Again, in the case in which, by the labor of continually increasing 
observations of a single quantity, we lessen the probable error, we know that it 
requires fourfold labor to reduce the probable error to one-half. It would seem, 
therefore, that the best hypothesis that we can adopt is that the evil of an error 
is proportional to the square of its magnitude. 
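
How the choice in case A turns on the assumed law of evil can be sketched as follows. The probabilities are hypothetical (the text says only "slightly less than one-half" and "very small"), and the function name is illustrative; the evil is taken as the expected value of the error raised to a chosen power.

```python
def evil(choice, outcomes, power):
    """Expected |error|**power of adopting `choice`; `outcomes` is a list of
    (true_value, probability) pairs."""
    return sum(p * abs(choice - v) ** power for v, p in outcomes)


# Hypothetical probabilities for case A: either reading may be the mistaken
# one, and there is a small chance that both are normal (truth near 2".5).
outcomes = [(0.0, 0.49), (2.5, 0.02), (5.0, 0.49)]

for power in (1, 2):
    row = ", ".join(f'{c}": {evil(c, outcomes, power):.3f}' for c in (0.0, 2.5, 5.0))
    print(f"power {power} -> {row}")
```

With the evil proportional to the simple error the three choices are nearly equivalent; with the evil proportional to the square, the middle value is clearly to be preferred.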

A determination has more or less value according as it is less or more liable 
to errors. The simplest definition of the value of an observation that we can 
adopt is that it is inversely as the sum total of the evils to which it is subjected, 
each evil being multiplied by its probability. This also is in strict accord with 
the ordinary law of probability of a number of observations, since it involves the 
hypothesis that the value of a result is proportional to the number of
observations on which it depends. As, however, the word "value," if used to express
this conception, would be ambiguous in consequence of being applied to the 
simple amount of a quantity, we shall use the term worth to express the economic 
value of an assigned value as just described. We therefore have the definitions : 

The evil of any value assigned to a quantity is equal to the sum of all the 
products obtained by multiplying the square of each possible error of that 
assigned value by the probability of its occurrence.* 

The worth of any such value is inversely proportioned to its evil. 

The value to which we are to give the preference is that whose worth is a 
maximum, or, which amounts to the same thing, whose evil is a minimum. This 
value, and the magnitude of the evil with which it is affected, will be two 
elements corresponding to "most probable value" and "probable error" in the 
usual theory. 

* This idea of an evil attached to each error, and proportional to the square of its magnitude, is due
to Gauss (Theoria Combinationis Observationum, etc., Pars prior, § 6), who applies the term jactura to the
conception here called evil.

§ 4. Algebraic Expressions for the Evil.

The general expression for these elements is readily obtained. Let us
represent all possible values which the required quantity can have by the series

$$x_1,\ x_2,\ x_3,\ \ldots,\ x_n,$$

and let

$$p_1,\ p_2,\ p_3,\ \ldots,\ p_n$$

be the several probabilities of these values; we necessarily have in this case

$$\sum_{i=1}^{n} p_i = 1.$$

Putting x for any value of the quantity in question, the evil of this value
will be, by definition,

$$e = p_1(x - x_1)^2 + p_2(x - x_2)^2 + \cdots + p_n(x - x_n)^2 = x^2 - 2Ax + B,$$

where

$$A = \sum_{i=1}^{n} p_i x_i, \qquad B = \sum_{i=1}^{n} p_i x_i^2.$$

This quantity is a minimum when

$$x = A,$$

which is, therefore, the value we are to prefer. The evil of this value is

$$e_0 = B - A^2.$$

The respective values of $A^2$ and B may be written, since $\sum p_i = 1$,

$$A^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} p_i p_j x_i x_j, \qquad B = \sum_{i=1}^{n}\sum_{j=1}^{n} p_i p_j x_i^2.$$

The value of the minimum evil then becomes

$$e_0 = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} p_i p_j (x_i - x_j)^2.$$

It therefore appears that the inverse of this expression is the worth of the best
value of the required quantity, which best value is given by the equation

$$x_0 = \sum_{i=1}^{n} p_i x_i. \qquad (1)$$
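
The expressions of this section for a finite set of values can be checked with a few lines of code. This is a sketch under stated assumptions: the values and probabilities are hypothetical, and the function names are illustrative. It computes A and the minimum evil B − A², and compares the latter with the double-sum form.

```python
def preferred_value_and_evil(values, probs):
    """Return (A, e0) with A = sum p_i x_i and e0 = B - A**2."""
    A = sum(p * x for x, p in zip(values, probs))
    B = sum(p * x * x for x, p in zip(values, probs))
    return A, B - A * A


def evil_pairwise(values, probs):
    """Minimum evil from the double sum (1/2) sum_ij p_i p_j (x_i - x_j)**2."""
    return 0.5 * sum(pi * pj * (xi - xj) ** 2
                     for xi, pi in zip(values, probs)
                     for xj, pj in zip(values, probs))


values = [0.0, 2.5, 5.0]     # hypothetical possible values
probs = [0.49, 0.02, 0.49]   # hypothetical probabilities, summing to 1
A, e0 = preferred_value_and_evil(values, probs)
print(A, e0, evil_pairwise(values, probs))
```

The two forms of the minimum evil agree, as the algebra requires whenever the probabilities sum to unity.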



In what precedes, the form of our equations is based upon the supposition 
that $x_1, x_2, \ldots, x_n$ are a finite number of discrete values which x may have. In
the usual case, however, the unknown quantity may have all values between
certain wide limits, and the probability that it is contained between the limits
x and x + dx is given by an equation of the form

$$dp = \phi(x)\,dx,$$

dp being an infinitesimal probability. Since this is a pure number, it follows
from this equation that, whatever physical quantity may be represented by x,
φ(x) must be of the dimension −1 in this quantity.

Reducing the formula (1) to the present case, we find that the preferable
value of x is given by the equation

$$x_0 = \int_{-\infty}^{+\infty} x\,\phi(x)\,dx. \qquad (2)$$

The evil of this value is

$$e_0 = \frac{1}{2}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \phi(y)\,\phi(z)\,(y - z)^2\,dy\,dz \qquad (3)$$

$$\phantom{e_0} = \int_{-\infty}^{+\infty} y^2\,\phi(y)\,dy - \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} yz\,\phi(y)\,\phi(z)\,dy\,dz.$$

From the definition of evil, it is a quantity of the dimensions + 2 in those 
of the physical quantity in question. Its square root is therefore a definite 
quantity of the magnitude to be determined, which we may regard as an error. 

An example of the results of the present theory will be found by applying 
it to the case of the usually assumed law of error, namely:

$$\phi(x) = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2}.$$

We then have $x_0 = 0$ as the preferable value of x, in accordance with the usual
theory. In the expression for the evil of this value we have

$$A = x_0 = 0, \qquad B = \frac{1}{2h^2}.$$

This value of B is identical with the square of what is commonly called the
mean error, which again is equal to 1.5 × probable error nearly. Hence, in this
special case, the evil is identical with the square of the mean error.

If, instead of taking zero as the value of x, we wish to express the evil of
any other assumed value, we have the expression

$$e = x^2 + B = \varepsilon^2 + x^2,$$

ε being the mean error. If, instead of ε, we use r, the probable error, the
expression will be

$$e = x^2 + 2.198\,r^2.$$

We readily find that if, instead of using the most probable value of a quantity, we
adopt a value differing from it by its probable error, the evil will be increased
by a little less than half its whole amount.
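
The numerical constants of this paragraph follow from B = 1/(2h²) and r = 0.4769/h. A minimal check, with illustrative names and an arbitrary value of h (the ratios do not depend on it):

```python
import math


def constants_for(h: float):
    """Mean error and probable error of the usual law with modulus h."""
    mean_error = 1.0 / (h * math.sqrt(2.0))   # sqrt(B), with B = 1/(2 h^2)
    probable_error = 0.4769 / h               # relation quoted in § 1
    return mean_error, probable_error


eps, r = constants_for(h=1.0)
ratio = eps / r                 # mean error in units of the probable error
coeff = (eps / r) ** 2          # coefficient of r^2 in e = x^2 + coeff * r^2
increase = r ** 2 / eps ** 2    # relative increase of the evil at x = r
print(ratio, coeff, increase)
```

The ratio comes out a little under 1.5, its square reproduces the coefficient 2.198, and the relative increase of the evil at x = r is somewhat less than one-half.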

It appears, therefore, that, whatever the law of error, we may always find 
two quantities corresponding to "most probable value," and "probable error" 
of that value in the usual theory. One of these quantities will naturally be 
the best value of the required magnitude, or A itself. The other may be either 
the evil of the value A, or the change in the value of the magnitude required
to increase its evil in a definite ratio, these last quantities being functions of the
same quantity, and therefore of each other. If we present the result in the
form

$$x = A \pm \sqrt{B - A^2},$$

the last term will be the "mean error" in the usual theory, and the change in
x which would double its evil in the generalized theory. If we wish to express
the quantity corresponding to the "probable error," we write

$$x = A \pm 0.6745\,\sqrt{B - A^2}.$$

§ 5. Modified Law of Probability. 

The whole problem now before us is reduced to finding a curve of
probability in the case of a number of observations of the same quantity. This
problem naturally involves that of the law of error of the separate observations, 
and leads us to inquire what modification should be made in the usually assumed 
law in order that it may be applicable to all cases whatever. 

The defect of the commonly assumed law, as represented by the equation

$$\phi(x) = \frac{h}{\sqrt{\pi}}\,e^{-h^2x^2},$$

is that, in practice, large errors are more frequent than this equation would
indicate them to be. This defect might be remedied by substituting some other
function of x than $h^2x^2$ as the exponent of e. The requirements of this function
would be:

1. That it should be an even function of x, or of the form $f(x^2)$.

2. That it should become infinite when x did.

3. That it should increase less rapidly than $h^2x^2$, or, more exactly, that the
second derivative $\dfrac{d^2f}{dx^2}$, instead of being a constant, should diminish with an increase
of x. Such a function can be formed by writing, instead of $h^2$, an expression
of the form

$$\frac{h^2\,(1 + k^2x^2)}{1 + k'^2x^2},$$

so that we should have, for the exponent $f(x^2)$, an expression of the form

$$f(x^2) = \frac{h^2x^2\,(1 + k^2x^2)}{1 + k'^2x^2}.$$
The management of such an exponent might, however, prove inconvenient, and 
I shall adopt a law of error founded on the very probable hypothesis that we 
are dealing with a mixture of observations having various measures of precision. 
Let us put

$$h_1,\ h_2,\ \ldots,\ h_n,$$

the possible separate values of the different measures of precision;

$$p_1,\ p_2,\ \ldots,\ p_n,$$

the corresponding probabilities that an observation selected at random has any
one of these several measures of precision.

Then, for an observation selected at random, the law of error will be

$$\phi(x) = \frac{1}{\sqrt{\pi}}\left[p_1h_1e^{-h_1^2x^2} + p_2h_2e^{-h_2^2x^2} + \cdots + p_nh_ne^{-h_n^2x^2}\right]. \qquad (5)$$
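
The law of error (5) is simple to evaluate. The sketch below is illustrative only: the function names are assumptions, the classes are those of the numerical example in § 7, and a plain trapezoidal rule is used to confirm that the total probability is unity.

```python
import math


def phi(x, classes):
    """Law of error (5); `classes` is a list of (p_i, h_i) pairs."""
    return sum(p * h * math.exp(-(h * x) ** 2) for p, h in classes) / math.sqrt(math.pi)


def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n intervals."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * step) for i in range(1, n))
    return total * step


classes = [(2 / 3, 4.0), (1 / 3, 1.0)]   # the two classes used in the example of § 7
area = integrate(lambda x: phi(x, classes), -10.0, 10.0)
print(area)
```

Because each term of (5) is a normal law weighted by its probability, the integral is the sum of the probabilities, that is, unity.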

§ 6. Deduction of Best Result. 

If we have m observations of the same quantity giving $x_1, x_2, \ldots, x_m$ as
the observed values, then, assuming the law of error expressed in (5), the
probability that the quantity is contained between the limits η and η + dη is given by
the equation

$$dp = \alpha\,\psi(\eta)\,d\eta,$$

in which

$$\psi(\eta) = \phi(x_1 - \eta)\,\phi(x_2 - \eta) \cdots \phi(x_m - \eta), \qquad (6)$$

while α is a constant so chosen as to make

$$\alpha\int_{-\infty}^{+\infty} \psi(\eta)\,d\eta = 1.$$

The formula (2) then gives for the best value of x the expression

$$x = \frac{\int_{-\infty}^{+\infty} \eta\,\psi(\eta)\,d\eta}{\int_{-\infty}^{+\infty} \psi(\eta)\,d\eta}. \qquad (7)$$

If ψ(η) be multiplied by any constant factor, it will disappear from this
expression; we may therefore disregard all such factors in forming ψ(η). We may
therefore take for ψ(η) the product of the m quantities

$$\begin{aligned}
&\left[h'_1e^{-h_1^2(x_1-\eta)^2} + h'_2e^{-h_2^2(x_1-\eta)^2} + \cdots + h'_ne^{-h_n^2(x_1-\eta)^2}\right] \\
\times{}&\left[h'_1e^{-h_1^2(x_2-\eta)^2} + h'_2e^{-h_2^2(x_2-\eta)^2} + \cdots + h'_ne^{-h_n^2(x_2-\eta)^2}\right] \\
\times{}&\cdots \\
\times{}&\left[h'_1e^{-h_1^2(x_m-\eta)^2} + h'_2e^{-h_2^2(x_m-\eta)^2} + \cdots + h'_ne^{-h_n^2(x_m-\eta)^2}\right],
\end{aligned} \qquad (8)$$

in which we have written, for brevity,

$$h' = hp.$$

This product will be formed of $n^m$ terms, each of the form

$$P\,e^{-h^2\eta^2 + 2b\eta - c},$$

where

$$\begin{aligned}
P &= h'_i\,h'_j\,h'_k \cdots h'_r, \\
h^2 &= h_i^2 + h_j^2 + h_k^2 + \cdots + h_r^2, \\
b &= h_i^2x_1 + h_j^2x_2 + h_k^2x_3 + \cdots + h_r^2x_m, \\
c &= h_i^2x_1^2 + h_j^2x_2^2 + h_k^2x_3^2 + \cdots + h_r^2x_m^2,
\end{aligned}$$

i, j, k, ..., r being any m of the n indices 1, 2, 3, ..., n, with repetition. We
therefore write

$$\psi(\eta) = \sum P\,e^{-h^2\eta^2 + 2b\eta - c}. \qquad (9)$$

We then have, by integration,

$$\int_{-\infty}^{+\infty} \psi(\eta)\,d\eta = \sqrt{\pi}\sum \frac{P}{h}\,e^{\frac{b^2}{h^2} - c}, \qquad \int_{-\infty}^{+\infty} \eta\,\psi(\eta)\,d\eta = \sqrt{\pi}\sum \frac{Pb}{h^3}\,e^{\frac{b^2}{h^2} - c}. \qquad (10)$$

If we distinguish the $n^m$ values of each of the quantities P, h, b and c by the
suffixes 1, 2, 3, ..., l, so that $l = n^m$, and if we put, for perspicuity,

$$\eta_t = \frac{b_t}{h_t^2}, \qquad w_t = \frac{P_t}{h_t}\,e^{h_t^2\eta_t^2 - c_t} \qquad (t = 1, 2, 3, \ldots, l), \qquad (11)$$

the equation (7) will give

$$x = \frac{w_1\eta_1 + w_2\eta_2 + \cdots + w_l\eta_l}{w_1 + w_2 + \cdots + w_l}. \qquad (12)$$

This result admits of a statement which will make the principles of the method
quite clear, independently of the analytic processes. The $n^m$ quantities $\eta_1, \eta_2$,
etc., are so many means by weights of the observed quantities $x_1, x_2, x_3, \ldots, x_m$,
each mean being obtained by making an hypothesis respecting the distribution
of the measures of precision $h_1, h_2, \ldots, h_n$ among the m separate observations.
Since each observation may, independently of all the others, have any one of the
n measures of precision, there will be $n^m$ such hypotheses, each leading to a different
mean, η. The final value of x is again a mean by weights of the results of the
different hypotheses, the weight of each result being proportional to the
probability of the hypothesis on which it depends, which is represented by w. This
probability is a product of two factors, of which one, $P/h$, is proportional to the
probability of the combination, while the other, $e^{h^2\eta^2 - c}$, is the probability of the
combination of outstanding errors to which the hypothesis leads.

§ 7. Application to an Example. 

Before showing how the preceding method may be simplified in practice, it 
may be of interest to give a simple numerical example of its rigorous
application. Let it be granted that we have three observations of a class for which there
is a probability of 2/3 that an observation is good, and of 1/3 that it is poor. Let the
measure of precision of a good observation be 4, and of a poor one 1. Let the
results of the three observations be

I, 0; II, 0; III, 1.

Since we have, a priori, no reason to distinguish between these results, the usual
method of treatment would lead either to 1/3 as the best result, or to the rejection
of the third observation, and hence to the result 0.

From the point of view of the present paper, the agreement of the first
two observations and the discordance of the third give color to the hypothesis
that the first two observations are good and the third poor. On this hypothesis
the best result would be 1/33, the weights of the results being 16, 16 and 1. But,
since every other hypothesis we can make would lead to a larger result, the best 
result must be greater than this. 
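
The example can also be worked by direct enumeration of the hypotheses, following formulas (11) and (12) of § 6. The sketch below is a minimal illustration: the function name and data layout are assumptions, and every assignment of a class to each observation is visited in turn.

```python
import itertools
import math


def best_value(xs, classes):
    """Rigorous best value by formulas (11) and (12).
    `xs` are the observed values; `classes` is a list of (p, h) pairs."""
    num = den = 0.0
    for combo in itertools.product(classes, repeat=len(xs)):
        P = math.prod(p * h for p, h in combo)              # product of the h' = h p
        h2 = sum(h * h for _, h in combo)                   # h^2 for this hypothesis
        b = sum(h * h * x for (_, h), x in zip(combo, xs))
        c = sum(h * h * x * x for (_, h), x in zip(combo, xs))
        eta = b / h2                                        # weighted mean under the hypothesis
        w = P / math.sqrt(h2) * math.exp(h2 * eta * eta - c)
        num += w * eta
        den += w
    return num / den


x_best = best_value([0.0, 0.0, 1.0], [(2 / 3, 4.0), (1 / 3, 1.0)])
print(f"value of maximum worth: {x_best:.4f}")
```

The value so found lies between 1/33 and 1/3, in accordance with the reasoning above, and may be compared with the result of the rigorous treatment which follows.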

The rigorous treatment of the problem gives

$$\phi(x) = \frac{1}{3\sqrt{\pi}}\left\{8e^{-16x^2} + e^{-x^2}\right\}.$$

Hence, when $x_1 = x_2 = 0$ and $x_3 = 1$,

$$\begin{aligned}
\psi(\eta) ={}& 512\,e^{-48\eta^2 + 32\eta - 16} + 64\,e^{-33\eta^2 + 2\eta - 1} \\
&+ 128\,e^{-33\eta^2 + 32\eta - 16} + 16\,e^{-18\eta^2 + 2\eta - 1} \\
&+ 8\,e^{-18\eta^2 + 32\eta - 16} + e^{-3\eta^2 + 2\eta - 1}.
\end{aligned}$$

These six terms correspond respectively to the six essentially different hypotheses
which can be made respecting the distribution of the measures of precision among
the different observations, it being observed that among the eight hypotheses
there are two pairs such that the members of each pair lead to identical results.
The results are tabulated as follows:

                                 Results.            Probability.
Hypothesis.                      b/h²                (P/h) e^(b²/h² − c)    Product.

I, II and III all good,          1/3   = 0.3333      0.003                  0.001
One of I, II bad; III good,      16/33 = 0.4848      0.006                  0.003
I and II bad; III good,          8/9   = 0.8889      0.318                  0.283
I and II good; III bad,          1/33  = 0.0303      4.224                  0.128
One of I, II bad; III bad,       1/18  = 0.0555      1.467                  0.082
I, II and III all bad,           1/3   = 0.3333      0.296                  0.099

                                                     Σ = 6.314              0.596

Hence, for the value of maximum worth, we have

$$x = 0.0944.$$

§ 8. Modification of the Method when the Observations are Numerous.

In order to apply the preceding method, it is necessary to know the
respective probabilities that the measure of precision of any one observation has the
several values $h_1, h_2, \ldots, h_n$. These probabilities are determined from the actual
distribution of the residuals with respect to magnitude as found by the study of
large masses of observations. If it were found that in any class of observations
the magnitudes of the residuals followed the commonly assumed law, we should
have but one value of h. If, as will commonly be the case, we find a larger
number of large residuals than would be given by the common theory, we assume
one, two, three or more additional values of h, and determine how many
observations we must assign to each class in order that the distribution may be
represented by an equation of the form (5).

To carry out the rigorous process of finding the best mean value of x, we
should form, by the equations (8) and (11), $n^m$ different values of P, h, b, c, η
and w, and thence, from (12), the required value of x.

To effect this, let us attach, or suppose to be attached, to each of the
quantities P, h, etc., a system of m indices, i, j, k, etc., each index taking the values
1, 2, ..., n. The system of indices

$$i,\ j,\ k,\ \ldots,\ r$$

attached to a quantity will then indicate the special value of that quantity which
results from assigning

to $x_1$ the precision $h_i$,
to $x_2$  "      "     $h_j$,
to $x_3$  "      "     $h_k$,
. . . . . . . . . . . . . . .
to $x_m$  "      "     $h_r$.

Moreover, we shall, for brevity, represent the combination of indices

$$(i,\ j,\ k,\ \ldots,\ r)$$

by the single symbol t.

Any one value of w in (11) may then be written in the form

$$w_t = \frac{P_t}{h_t}\,e^{-\Delta_t},$$

where we put, for brevity,

$$\Delta = c - h^2\eta^2.$$

If we here substitute for η, h and c their values from (8), we find that this
expression reduces to

$$\Delta = \frac{1}{h^2}\sum_{i}\sum_{j} h_i^2h_j^2\,(x_i - x_j)^2 \qquad (i = 1, 2, 3, \ldots, m-1;\ \ j = i+1, i+2, \ldots, m),$$

where $h_i$ for the moment indicates the special value of h which, in any
combination, is assigned to $x_i$. We may equally represent Δ in the form

$$\Delta = \frac{1}{2h^2}\sum_{i=1}^{m}\sum_{j=1}^{m} h_i^2h_j^2\,(x_i - x_j)^2.$$

The first form will consist of $\frac{m(m-1)}{2}$ terms; the second of $m^2$ terms, if we
count those which vanish through i = j.

Since Δ depends only on the differences between the x's, it will remain
unchanged when we subtract any constant from all of them. Let us then
subtract η from all of them, putting

$$\xi_i = x_i - \eta;$$

we shall then have, for each combination of indices,

$$\Delta = h_i^2\xi_1^2 + h_j^2\xi_2^2 + h_k^2\xi_3^2 + \cdots + h_r^2\xi_m^2.$$

In strictness, each of the values of $\xi_i$ should be affected by the m indices
i, j, k, ..., r, because the values of η are so affected. But, when m is large, the
different values of η arising from different combinations of indices will differ
but slightly in the large majority of combinations. We may therefore, to form
the m ξ's, take one general mean value of η for all the combinations of the
indices.

We have now to substitute the above value of Δ in the exponential
expression for w. The result may be expressed in the following form. If we put

$$\begin{aligned}
w_i^{(1)} &= h'_i\,e^{-h_i^2\xi_1^2}, \\
w_i^{(2)} &= h'_i\,e^{-h_i^2\xi_2^2}, \qquad (i = 1, 2, 3, \ldots, n), \\
&\ \ \vdots \\
w_i^{(m)} &= h'_i\,e^{-h_i^2\xi_m^2}
\end{aligned} \qquad (13)$$

(in which, it will be noted, the suffix i has its original signification), we shall
have

$$w_{(i,j,k,\ldots,r)} = \frac{w_i^{(1)}\,w_j^{(2)}\,w_k^{(3)} \cdots w_r^{(m)}}{h_{(i,j,k,\ldots,r)}}.$$

We have now to introduce another approximate hypothesis, namely, that
the various values of h are so nearly equal that they may be regarded as having
a common value. We note that $h^2$ is the sum of the squares of the m h's, and
that, in the large majority of cases, the sum will approximate to a certain mean
value, found by distributing the m values among the n classes proportionally to
their several probabilities.

If we now substitute the preceding values of w and η in (12) we find, on
the above hypothesis, that the value of x may be reduced to the form

$$x = \frac{W_1x_1 + W_2x_2 + W_3x_3 + \cdots + W_mx_m}{W_1 + W_2 + W_3 + \cdots + W_m},$$

where

$$\begin{aligned}
W_1 &= \left(h_1^2w_1^{(1)} + h_2^2w_2^{(1)} + h_3^2w_3^{(1)} + \cdots + h_n^2w_n^{(1)}\right) \div \sum_i w_i^{(1)}, \\
W_2 &= \left(h_1^2w_1^{(2)} + h_2^2w_2^{(2)} + h_3^2w_3^{(2)} + \cdots + h_n^2w_n^{(2)}\right) \div \sum_i w_i^{(2)}, \\
&\ \ \vdots \\
W_m &= \left(h_1^2w_1^{(m)} + h_2^2w_2^{(m)} + h_3^2w_3^{(m)} + \cdots + h_n^2w_n^{(m)}\right) \div \sum_i w_i^{(m)}.
\end{aligned} \qquad (14)$$

These values of W give the modified weights of the several results $x_1, x_2, \ldots, x_m$
arising from the probable variability of h from one observation to another. Were
there no such variability; were there but one value of h, then these weights
would all be equal. But, in the case for which the preceding theory is
constructed, each result $x_i$ may have any one of the weights $h_1^2, h_2^2$, etc.; and the
equation (14) determines a certain mean among these weights which we are to
assign to $x_i$. The coefficients W are functions of ξ, and admit of being
tabulated as such in any special case.

In what precedes we have presupposed no difference of weights among the 
results $x_1, \ldots, x_m$ to be known a priori. But, since each observation may have
any one of the weights $h_1^2, \ldots, h_n^2$, a certain mean weight W of each is
determined a posteriori, as a function of its deviation from the general mean, by the
equations (14). This mean weight can be tabulated as a function of ξ, and thus
taken out from a table with a single argument. 

If, however, we have some knowledge of an observation which leads us to 
assign it one precision rather than another, we may utilize this knowledge so 
as to modify the values of $w_i^{(k)}$. If $h_1, h_2$, etc., are taken in the descending order
of magnitude, then $h_1^2$ will be the weight of each observation of the best class.
The theoretically best mode of dealing with such cases will, however, depend upon 
the circumstances of that case. Simplicity is so important an advantage that it 
will probably be found well to adopt the rule of replacing W by its product by 
the weight fixed from a priori considerations. 
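
The approximate process of equations (13) and (14) may be sketched as follows. Two points are assumptions of the sketch rather than statements of the text: the "general mean value of η" is obtained by repeating the weighting until it settles, and the data are those of the example in § 7, where m is far too small for the approximation to be close. The function names are illustrative.

```python
import math


def modified_weights(xs, classes, eta):
    """Weights W_k of equation (14) for residuals taken about `eta`.
    `classes` is a list of (p, h) pairs."""
    weights = []
    for x in xs:
        xi = x - eta
        w = [p * h * math.exp(-(h * xi) ** 2) for p, h in classes]   # equation (13)
        W = sum(h * h * wi for (_, h), wi in zip(classes, w)) / sum(w)
        weights.append(W)
    return weights


def approximate_best_value(xs, classes, rounds=50):
    """Repeat: weight about the current mean, then recompute the mean."""
    eta = sum(xs) / len(xs)              # start from the arithmetical mean
    for _ in range(rounds):
        W = modified_weights(xs, classes, eta)
        eta = sum(Wk * x for Wk, x in zip(W, xs)) / sum(W)
    return eta, modified_weights(xs, classes, eta)


classes = [(2 / 3, 4.0), (1 / 3, 1.0)]   # the two classes of the example in § 7
x_approx, W = approximate_best_value([0.0, 0.0, 1.0], classes)
print(x_approx, W)
```

The discordant observation receives a weight near that of the poorer class, while the two accordant ones receive weights near that of the better class, which is the behaviour the coefficients W are intended to produce.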

§ 9. Application to Transits of Mercury over the Sun's Disc. 

I now propose to apply the preceding theory to the case of observed
contacts of the inferior planets, Mercury and Venus, with the limb of the Sun. A
peculiarity of the observations of these phenomena is the great number of them 
which investigators have had to reject on account of discordance of individual 
observations from the general mean. By suitable rejection very different final 
results may be obtained, and it is impossible to draw any line between those 
observations which should be rejected and those which should be retained. 

In my discussion of observations of transits of Mercury,* I have shown 
that the residuals of 684 observations of the interior contact of the limbs of the 
Sun and Mercury are distributed as follows, the value of each residual being 
considered only to the nearest round 5 seconds : 

    Residuals (s)                  Number
    Below -27                        20
    -27, -26, -25, -24, -23          11
    -22, -21, -20, -19, -18          15
    -17, -16, -15, -14, -13          44
    -12, -11, -10,  -9,  -8          77
     -7,  -6,  -5,  -4,  -3         132
     -2,  -1,   0,  +1,  +2         147
     +3,  +4,  +5,  +6,  +7          89
     +8,  +9, +10, +11, +12          52
    +13, +14, +15, +16, +17          33
    +18, +19, +20, +21, +22          23
    +23, +24, +25, +26, +27          12
    Exceeding +27                    29

* Astronomical Papers of the American Ephemeris, Vol. I.



Collecting those of equal absolute value and classifying them according to
the middle value of each group, we have the following comparison of actual and
probable numbers, the latter being obtained on the usual theory, assuming a
probable error of ±6.67 s, or a value of $1/h$ = 14.0 s.

    Mean Value of      Actual      Probable      A - P
    Residual (s)       Number      Number
          0             147          137          +10
          5             221          240          -19
         10             129          166          -37
         15              77           88          -11
         20              38           36           +2
         25              23           12          +11
        >27              49            5          +44



There is, therefore, a large excess of both small and large residuals, which would 
have been yet more pronounced had the mean error been determined from the 
sum of the squares of all the residuals. 
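
The comparison above may be retraced numerically. The following is a sketch only: it folds the signed groups by absolute value and computes the probable numbers from the usual law with $1/h = r/0.4769$. The group boundaries at 2.5, 7.5, ..., 27.5 seconds are an assumption, since the text does not state where one group ends and the next begins, and all variable names are hypothetical.

```python
import math

# Signed groups from the first table: (central value in s, count).
signed = [(-25, 11), (-20, 15), (-15, 44), (-10, 77), (-5, 132), (0, 147),
          (5, 89), (10, 52), (15, 33), (20, 23), (25, 12)]
below, above = 20, 29            # residuals below -27 s and exceeding +27 s
total = below + above + sum(n for _, n in signed)

# Fold the groups by absolute value.
actual = {}
for centre, n in signed:
    actual[abs(centre)] = actual.get(abs(centre), 0) + n
actual_tail = below + above

# Probable numbers on the usual theory.
r = 6.67                         # probable error in seconds
inv_h = r / 0.4769               # 1/h, close to 14.0 s

def share_within(limit):
    """Probability that |error| < limit for the law h/sqrt(pi)*exp(-h^2 x^2)."""
    return math.erf(limit / inv_h)

edges = [0.0, 2.5, 7.5, 12.5, 17.5, 22.5, 27.5]    # assumed group boundaries
probable = [total * (share_within(hi) - share_within(lo))
            for lo, hi in zip(edges, edges[1:])]
probable_tail = total * (1.0 - share_within(edges[-1]))

for centre, p in zip(sorted(actual), probable):
    print(f"{centre:>4} {actual[centre]:>4} {p:7.1f} {actual[centre] - p:+7.1f}")
print(f" >27 {actual_tail:>4} {probable_tail:7.1f} {actual_tail - probable_tail:+7.1f}")
```

Under these assumptions the computed probable numbers agree with the tabulated ones to within about a unit or two, the small differences being attributable to rounding and to the assumed boundaries.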

I find, by several trials, that the residuals which do not exceed 27 s can be
well represented by the following distribution of precisions and probable errors:

    Number of          1 : h        Probable
    observations        (s)         error (s)
        110               6            2.9
        100              10            4.8
        400              18            8.6
         60              36           17.2

The comparison of actual and probable numbers of residuals will then be as follows:

    Residual (s)       Actual      Probable      A - P
          0             147          143           +4
          5             221          220           +1
         10             129          128           +1
         15              77           76           +1
         20              38           44           -6
         25              23           23            0
        >27              49           26          +23
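
The probable numbers for the four assumed classes may be approximated in the same manner. The sketch below is a hedged illustration: the group boundaries at 2.5, 7.5, ..., 27.5 seconds are again an assumption, and the names are hypothetical. The tabulated probable numbers sum to 660 rather than 670, so exact agreement should not be expected.

```python
import math

# (number of observations, 1/h in seconds) as given in the text.
classes = [(110, 6.0), (100, 10.0), (400, 18.0), (60, 36.0)]

def expected_within(limit):
    """Expected number of residuals with |residual| < limit."""
    return sum(n * math.erf(limit / inv_h) for n, inv_h in classes)

edges = [0.0, 2.5, 7.5, 12.5, 17.5, 22.5, 27.5]    # assumed group boundaries
probable = [expected_within(hi) - expected_within(lo)
            for lo, hi in zip(edges, edges[1:])]
n_total = sum(n for n, _ in classes)
probable_tail = n_total - expected_within(edges[-1])

# Probable error of each class from r = 0.4769 / h.
probable_errors = [0.4769 * inv_h for _, inv_h in classes]
```

The list `probable_errors` reproduces the probable errors quoted for the four classes, and `probable` follows the tabulated column to within a few units.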



It must be understood that these four values of $1:h$ and of the consequent
probable errors are not four entirely determinate quantities. Really we should
consider that the precision has all values between the extreme limits; but it is not
at all necessary to consider it as what it really is, a continuously varying quantity.
All we have to do is to form an expression which shall represent the relation
between the number and magnitude of the residuals; and this we do most
conveniently by assuming three, four or more separate values of $h$, and then
finding how many observations we have to assign to each class in order to
represent the observed relation. From the above table we may infer that about
one-third the observations of transits of Mercury belong to classes which might
be called good or very good, the probable error ranging from 2½ s to 6 s; that more
than half belong to an average class, of which the probable error may range
from 6 to 12 seconds, and that about one-twelfth are made under such
unfavorable circumstances that their probable error averages 17 s. Even with this
large probable error, we see that there is an excess of 23 residuals exceeding 27
seconds, so that we should have increased the number of this imperfect class.
But I suspect that many of these arose from errors of a minute in the record,
or from other pure blunders.

I am also inclined to think that the comparative excess of very small
residuals, indicating that one-fifth of all the observations had a probable error as
small as 3 s, may be partly due to the fact that many of the residuals are
deviations from the mean of a small number of observations, and that no comparison
of the separate observations with the final theory founded on the whole mass of
observations was made. On the whole, we may suppose that of the actual
observations

    0.30 have a precision $h_1$ = 1 : 10,
    0.60   "  "     "     $h_2$ = 1 : 18,
    0.10   "  "     "     $h_3$ = 1 : 36.

This will give for the law of probability (Eq. 5):

$$\phi(x) = \frac{1}{\sqrt{\pi}}\left\{.030\,e^{-x^2/100} + .0333\,e^{-x^2/324} + .00277\,e^{-x^2/1296}\right\}.$$

We have, for the values of $w_1$, $w_2$ and $w_3$, as given by (13),

$$w_1 = .03000\,e^{-\xi^2/100}; \quad h_1^2 = 1 : 100,$$
$$w_2 = .03333\,e^{-\xi^2/324}; \quad h_2^2 = 1 : 324,$$
$$w_3 = .00277\,e^{-\xi^2/1296}; \quad h_3^2 = 1 : 1296.$$

In the following table the quantities required are tabulated as a function of
the residual $\xi$ of an observation. The $w$'s are multiplied by 1,000, and the $W$'s
by 10,000, so as to express them in convenient units:

    ξ (s)      w_1       w_2       w_3        W
      0       30.0      33.3      2.8        6.1
      2       28.8      32.9      2.8        6.1
      4       25.6      31.7      2.7        5.9
      6       20.9      29.8      2.7        5.7
      8       15.8      27.4      2.6        5.3
     10       11.0      24.5      2.6        4.9
     12        7.1      21.4      2.5        4.5
     14        4.2      18.2      2.4        4.0
     16        2.3      15.1      2.3        3.6
     18        1.2      12.3      2.2        3.3
     20        0.55      9.70     2.04       3.0
     22        0.23      7.49     1.91       2.8
     24        0.09      5.63     1.78       2.6
     26        0.03      4.14     1.65       2.5
     28        0.01      2.96     1.52       2.3
     30        0.00      2.07     1.39       2.2
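
The table may be recomputed from the three expressions for $w_1$, $w_2$, $w_3$. The sketch below assumes that equation (14) gives $W$ as the mean of the $h_k^2$ with coefficients $w_k$; that form is an assumption, as are the function names. Under that reading the tabulated $W$, and the class weights 10, 3 and 0.77, correspond to $h^2$ multiplied by 1,000, and the sketch uses that factor.

```python
import math

# (share p, 1/h in seconds) for the three classes adopted in the text.
classes = [(0.30, 10.0), (0.60, 18.0), (0.10, 36.0)]

def small_w(xi):
    """w_k = p_k * h_k * exp(-h_k^2 * xi^2) for each class."""
    return [p / inv_h * math.exp(-(xi / inv_h) ** 2) for p, inv_h in classes]

def big_w(xi):
    """Assumed form of (14): mean of h_k^2 taken with coefficients w_k."""
    ws = small_w(xi)
    return sum(w / inv_h ** 2 for w, (_, inv_h) in zip(ws, classes)) / sum(ws)

rows = []
for xi in range(0, 31, 2):
    w1, w2, w3 = (1000 * w for w in small_w(xi))
    rows.append((xi, w1, w2, w3, 1000 * big_w(xi)))   # factor 1000: see note above

for xi, w1, w2, w3, W in rows:
    print(f"{xi:>3} {w1:6.2f} {w2:6.2f} {w3:6.2f} {W:5.1f}")
```

A single function of one argument suffices because, on this reading, $W$ depends on the observation only through its residual $\xi$.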



If we could be sure that any one observation belonged to the best class,
its weight on the above scale would be 10; were we sure it belonged to the
intermediate class, its weight would be 3, and if it belonged to the worst class,
it would be 0.77. The value of $W$ for $\xi = 0$, namely, 6.1, falls below 10 in
consequence of the probability that an observation of residual zero may belong
to one of the poorer classes, and the value of $W$ for an observation of residual
30 s is above 0.77 on account of the possibility that such an observation may
belong to the intermediate class.

§ 10. Approximate Expression for the Evil.

It remains to find an expression for the evil of the best result, as obtained
by the preceding method. As already defined and shown, if the probability
that the value of the observed quantity is contained between the limits $x$ and
$x + dx$ be

$$\theta_x\,dx,$$

then the evil of any assumed value $x_0$ of the required quantity is given by the
equation

$$E = \int (x - x_0)^2\,\theta_x\,dx.$$

In the case of $m$ observed values of $x$, $x_1, x_2, \ldots, x_m$, following the general
law of error, we have

$$\theta_x = a\,\phi(x_1 - x)\,\phi(x_2 - x) \cdots \phi(x_m - x),$$

the coefficient $a$ being a constant, determined by the condition

$$\int \theta_x\,dx = 1.$$

If we take for $x_0$ the best value of $x$, namely, the value which satisfies the
condition

$$x_0 = \int_{-\infty}^{+\infty} x\,\theta_x\,dx,$$

we have for its evil

$$E = \int_{-\infty}^{+\infty} x^2\,\theta_x\,dx - x_0^2.$$
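
The best value and its evil, as just defined, can be approximated by direct numerical integration. The following is a minimal sketch, assuming the three-class law of error adopted in § 9; the observations `xs`, the integration limits and the function names are hypothetical.

```python
import math

classes = [(0.30, 10.0), (0.60, 18.0), (0.10, 36.0)]   # (share, 1/h in s), from § 9

def phi(delta):
    """Law of error: sum of p*h/sqrt(pi) * exp(-h^2 * delta^2)."""
    return sum(p / (inv_h * math.sqrt(math.pi)) * math.exp(-(delta / inv_h) ** 2)
               for p, inv_h in classes)

def best_value_and_evil(xs, lo=-200.0, hi=200.0, steps=40000):
    """Return (x0, E) from a plain Riemann sum over [lo, hi]."""
    dx = (hi - lo) / steps
    s0 = s1 = s2 = 0.0
    for k in range(steps + 1):
        x = lo + k * dx
        theta = 1.0
        for xi in xs:
            theta *= phi(xi - x)       # theta_x without the constant a
        s0 += theta
        s1 += x * theta
        s2 += x * x * theta
    x0 = s1 / s0                       # dividing by s0 supplies the constant a
    return x0, s2 / s0 - x0 * x0

xs = [-3.0, 1.0, 2.0, 4.0, 31.0]       # hypothetical observations in seconds
x0, evil = best_value_and_evil(xs)
plain_mean = sum(xs) / len(xs)
```

For these hypothetical observations the best value `x0` falls below the arithmetical mean, the discordant value 31 being given less influence.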

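It will be noticed that the function $\theta_x$ differs from $\psi(x)$ in (6) only in
containing the factor $a$; that is, we have

$$\theta_x = a\,\psi(x) = a \sum P\,e^{-H^2x^2 + 2H^2\eta x - \sum h_i^2 x_i^2}.$$

In consequence of $a$ we may omit from $\psi$ any constant factor, as $1/\sqrt{\pi}$. Then
from

$$\psi(x) = \sum P\,e^{-H^2x^2 + 2H^2\eta x - \sum h_i^2 x_i^2} \qquad (9)$$

we have

$$\int_{-\infty}^{+\infty} \psi(x)\,dx = \sum P\,\frac{\sqrt{\pi}}{H}\,e^{H^2\eta^2 - \sum h_i^2 x_i^2}.$$

$a$ is determined by the condition

$$a \int_{-\infty}^{+\infty} \psi(x)\,dx = 1 = a \sum w.$$

Comparing with the equations (9) and (11), we find

$$\int_{-\infty}^{+\infty} x^2\,\theta_x\,dx = \frac{1}{\sum w} \sum w \left(\eta^2 + \frac{1}{2H^2}\right),$$

the sign $\sum$ extending to the $n^m$ distributions of the measures of precision. We
thus have, for the minimum evil,

$$E = \frac{1}{(\sum w)^2} \sum w_i w_j (\eta_i - \eta_j)^2 + \frac{1}{2 \sum w} \sum \frac{w}{H^2}$$

$$(i = 1, 2, \ldots, n^m - 1; \quad j = i + 1, \ldots, n^m) \qquad (15)$$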

The second term of the right-hand member of this equation is a certain mean
among the various values of $\frac{1}{2H^2}$, and coincides with the square of the "mean
error" of the usual theory, which, as already shown, coincides with the "evil,"
as that term is here defined. If there is but one distribution of the $h$'s among the
$x$'s, then there will be but one value of $\eta$, and the first term of the evil will vanish,
so that we shall have no evil left except the "mean error." But when, as here
supposed, the weights of the observations are themselves uncertain, then the last
equation expresses the logical conclusion that, in order to obtain the total evil,
we must add to the result of the mean uncertainty of the observations a quantity
depending upon the uncertainty in the weights we should individually or
collectively assign to them.
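
For a small number of observations the two terms of (15) can be computed by enumerating every distribution of the precisions. The sketch below is illustrative only. It assumes that (15) consists of a first term $\sum w_i w_j(\eta_i - \eta_j)^2 / (\sum w)^2$ taken over pairs $i < j$ and a second term equal to the mean of $1/(2H^2)$ with coefficients $w$, as the surrounding text indicates; it takes the classes of § 9, and the observations `xs` and all names are hypothetical.

```python
import itertools
import math

classes = [(0.30, 10.0), (0.60, 18.0), (0.10, 36.0)]   # (share, 1/h in s), from § 9
xs = [-3.0, 1.0, 2.0, 4.0, 31.0]                        # hypothetical observations

terms = []   # one (w, eta, H^2) for each distribution of the precisions
for choice in itertools.product(classes, repeat=len(xs)):
    P = math.prod(p / inv_h for p, inv_h in choice)      # product of the p*h
    h2 = [1.0 / inv_h ** 2 for _, inv_h in choice]
    H2 = sum(h2)
    eta = sum(a * x for a, x in zip(h2, xs)) / H2        # weighted mean for this choice
    S = sum(a * (x - eta) ** 2 for a, x in zip(h2, xs))
    w = P * math.sqrt(math.pi / H2) * math.exp(-S)       # integral of this term over x
    terms.append((w, eta, H2))

sum_w = sum(w for w, _, _ in terms)
x0 = sum(w * eta for w, eta, _ in terms) / sum_w

# Second term of (15): mean of 1/(2 H^2) with coefficients w.
mean_error_sq = sum(w / (2.0 * H2) for w, _, H2 in terms) / sum_w
# First term of (15): pairs i < j of w_i * w_j * (eta_i - eta_j)^2 over (sum w)^2.
spread = sum(wi * wj * (ei - ej) ** 2
             for (wi, ei, _), (wj, ej, _) in itertools.combinations(terms, 2)) / sum_w ** 2
evil = spread + mean_error_sq

# Cross-check: the first term equals the w-weighted variance of the eta's about x0.
var_eta = sum(w * (eta - x0) ** 2 for w, eta, _ in terms) / sum_w
```

Five observations and three classes already give 243 distributions and 29,403 pairs, which shows how quickly the direct computation grows with the number of observations.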

The first term of (15) comprises $\frac{\nu(\nu - 1)}{2}$ terms ($\nu = n^m$); its actual
computation is therefore out of the question. We may, however, remark that it admits
of being expressed in the form

$$\sum a_{i,j}(x_i - x_j)^2,$$

$a_{i,j}$ being numerical coefficients. This form contains only $\frac{m(m - 1)}{2}$ terms.

To show this, we remark that each value of $\eta$ may be written in the form

$$\eta = p_1 x_1 + p_2 x_2 + \ldots + p_m x_m,$$

where $p_1 + p_2 + p_3 + \ldots + p_m = 1$.

The difference of any two values of $\eta$ multiplied by any factor, such as
$\frac{\sqrt{w_i w_j}}{\sum w}$, will therefore be of the form

$$\mu_1 x_1 + \mu_2 x_2 + \ldots + \mu_m x_m,$$

where

$$\mu_1 + \mu_2 + \mu_3 + \ldots + \mu_m = 0, \qquad (16)$$

and where each of the coefficients $\mu_1$, $\mu_2$, etc., takes $\frac{\nu(\nu - 1)}{2}$ values,
corresponding to the number of differences of the $\eta$'s. The sum of the squares of
these differences will be

$$x_1^2 \sum \mu_1^2 + x_2^2 \sum \mu_2^2 + \ldots + x_m^2 \sum \mu_m^2$$
$$+\ 2x_1x_2 \sum \mu_1\mu_2 + 2x_1x_3 \sum \mu_1\mu_3 + 2x_2x_3 \sum \mu_2\mu_3 + \text{etc.}$$

On account of the condition (16) this expression may be transformed into

$$\sum A_{i,j}(x_i - x_j)^2,$$

where

$$A_{i,j} = -\sum \mu_i \mu_j.$$

Here the sign $\sum$ of summation extends to all the $\frac{\nu(\nu - 1)}{2}$ products of $\mu$
having the same constant suffixes $i$ and $j$.
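
The transformation rests on the identity that, when the coefficients $\mu$ sum to zero, the square of $\mu_1 x_1 + \ldots + \mu_m x_m$ equals $-\sum \mu_i \mu_j (x_i - x_j)^2$ taken over the pairs $i < j$. A brief numerical check, with arbitrary hypothetical values, may be sketched thus:

```python
import itertools
import random

random.seed(0)
m = 6
xs = [random.uniform(-10, 10) for _ in range(m)]
mu = [random.uniform(-1, 1) for _ in range(m)]
shift = sum(mu) / m
mu = [u - shift for u in mu]          # enforce condition (16): the mu sum to zero

pairs = list(itertools.combinations(range(m), 2))

# Square of the linear form in the x's.
lhs = sum(u * x for u, x in zip(mu, xs)) ** 2
# The same quantity expressed through the differences x_i - x_j.
rhs = -sum(mu[i] * mu[j] * (xs[i] - xs[j]) ** 2 for i, j in pairs)
```

Here `len(pairs)` is $m(m-1)/2$, the number of terms in the transformed expression.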

Although the value of this expression admits of rigorous algebraic
determination, it is doubtful whether there would be any advantage in computing it
in any special case. I shall therefore seek only for a rough approximation to
its probable value. Returning to the equation (15), we first note that the
expression

$$\frac{\sum w_i w_j (\eta_i - \eta_j)^2}{(\sum w)^2} \qquad (i = 1, 2, \ldots, n^m - 1; \quad j = i + 1, \ldots, n^m)$$

is one-half a weighted mean value of $(\eta_i - \eta_j)^2$, the weight of each being the
product $w_i w_j$, and the zero terms $(\eta_i - \eta_i)^2$ being allowed to enter with half
weight in taking the mean. Instead of this weighted mean we may take the
general mean formed by giving all the differences equal weight.

Now, when the number of observations treated is large, we may consider
the amount by which any one value of $\eta$ differs from the mean of all the values,
or $\bar{x}$, to be the result of an accumulation of accidental errors; namely, if we put

$$\frac{h_i^2}{H^2} = q_i \qquad (17)$$

(where $h_i$ means the precision assigned to $x_i$), we shall have

$$\eta = q_1 x_1 + q_2 x_2 + \ldots + q_m x_m,$$

while we have for $\bar{x}$ an expression of the form

$$\bar{x} = q'_1 x_1 + q'_2 x_2 + \ldots + q'_m x_m,$$

$q'_i$ being the weighted mean of all the $n^m$ values of $q_i$. If we now put, as before,
$\xi_i$ for the deviation of $x_i$ from the general mean $\bar{x}$, we have, on account of
$\sum q_i = \sum q'_i = 1$,

$$\eta - \bar{x} = (q_1 - q'_1)\xi_1 + (q_2 - q'_2)\xi_2 + \ldots + (q_m - q'_m)\xi_m.$$

Now, since $H^2$ is, in all cases, the sum of some $m$ values of $h^2$, the mean
value of $q$ given by (17) is $\frac{1}{m}$. The actual special values can never reach zero
as their lower limit, and will seldom exceed $\frac{2}{m}$ as their upper limit. The range
of value will, however, depend upon the range of values taken by the
precisions $h$. Unless in extreme cases, the mean deviation of $q$ from $\frac{1}{m}$ cannot
exceed […]; that is, the mean value of $q - q'$ will, in general, be less than […].
Assigning this mean value, we may regard $\eta - \bar{x}$ as made up of the probable
accumulation of terms

[…]

If we put $\Delta^2$ for the mean value of $\xi^2$, then, by the theory of errors, we shall
have, for the mean value of $(\eta - \bar{x})^2$,

mean $(\eta - \bar{x})^2 = \frac{\Delta^2}{[…]}$.

Hence mean $(\eta_i - \eta_j)^2$ = […].

To compare this with the last term of (15), let us suppose the most probable
distribution of precisions with respect to their magnitude to be

    $m_1$ precisions of value $h_1$;
    $m_2$ precisions of value $h_2$;
    . . . .
    $m_n$ precisions of value $h_n$.

We shall then have as a close approximation to the mean value of $H^2$,

$$H^2 = m_1 h_1^2 + m_2 h_2^2 + \ldots + m_n h_n^2,$$

one-half the reciprocal of which may be taken for the last term of (15). Thus
we may take for the amount of the evil, in all ordinary cases,

$$E = \frac{1}{2\,(m_1 h_1^2 + \ldots + m_n h_n^2)} + \frac{[…]}{2m}.$$

As already shown, this evil will be the square of the "mean error" to be
expected of the usual theory; so that we may take $\sqrt{E}$ as the mean error to be
expected.

I remark, in conclusion, that this theory and method may be extended to
the case of several unknown quantities without any other difficulty than that of 
a resolution of the equations of condition with the a posteriori weights. We 
should first solve the equations if necessary, using equal weights for all, or such 
system of weights as might be deemed most probable. From the residuals thus 
obtained we should deduce the law of error, and in practice we should, in order 
to determine such law, combine with the residuals in question all others that 
astronomy could furnish pertaining to the same class of observations. Then 
we should re-solve the equations using the modified weights, which re-solution 
would give us the definitive result. 
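
The procedure may be sketched for two unknowns, the constants of a straight line. This is a hedged illustration and not a definitive implementation: the law of error is borrowed from § 9 rather than deduced from the residuals, equation (14) is assumed to give $W$ as the mean of the $h_k^2$ with coefficients $p_k h_k e^{-h_k^2 \xi^2}$, and the data and function names are hypothetical.

```python
import math

classes = [(0.30, 10.0), (0.60, 18.0), (0.10, 36.0)]   # (share, 1/h), borrowed from § 9

def posterior_weight(xi):
    """Assumed form of (14): mean of h^2 with coefficients p*h*exp(-h^2 xi^2)."""
    ws = [p / inv_h * math.exp(-(xi / inv_h) ** 2) for p, inv_h in classes]
    return sum(w / inv_h ** 2 for w, (_, inv_h) in zip(ws, classes)) / sum(ws)

def weighted_line(ts, ys, weights):
    """Solve the normal equations for y = a + b*t with the given weights."""
    sw = sum(weights)
    st = sum(w * t for w, t in zip(weights, ts))
    sy = sum(w * y for w, y in zip(weights, ys))
    stt = sum(w * t * t for w, t in zip(weights, ts))
    sty = sum(w * t * y for w, t, y in zip(weights, ts, ys))
    det = sw * stt - st * st
    return (stt * sy - st * sty) / det, (sw * sty - st * sy) / det

# Hypothetical equations of condition: an exact line with one discordant value.
ts = list(range(10))
ys = [1.0 + 2.0 * t for t in ts]
ys[9] += 30.0

# First solution with equal weights, then the residuals, then the re-solution.
a_eq, b_eq = weighted_line(ts, ys, [1.0] * len(ts))
residuals = [y - (a_eq + b_eq * t) for t, y in zip(ts, ys)]
weights = [posterior_weight(r) for r in residuals]
a_rw, b_rw = weighted_line(ts, ys, weights)
```

In this example the re-solution brings the slope nearer to the value 2 used in constructing the data, though the discordant observation, being only reduced in weight and not rejected, still exerts some influence.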



